fix: handle empty JSON-LD framing responses and prevent log injection by kaiprodevops · Pull Request #40 · eclipse-thingweb/domus-tdd-api

kaiprodevops · 2026-04-09T20:13:45Z

Summary

Fixes uncaught JSONDecodeError when the JSON-LD framing process fails or returns empty output, and prevents log injection vulnerabilities in error handling.

Problem

When frame-jsonld.js returns an empty string (e.g., due to unreachable custom context URLs or malformed N-Triples), the call to json.loads(jsonld_compacted) in frame_td_nt_content() raises an uncaught json.decoder.JSONDecodeError. Flask's default error handler converts this to an unstructured HTTP 500 Internal Server Error with no meaningful message.

This affects any GET /things/{id} request for a Thing Description whose custom context URL was unreachable at ingestion or framing time.

Root Cause

No validation of subprocess output before JSON parsing
No exit code checking for the Node.js framing process
Potential for log injection via unsanitized stderr output
Information leakage through detailed error messages in HTTP responses
Risk of pipe buffer deadlock using write()/read() pattern

Solution

1. Improved Subprocess Communication (`tdd/common.py`)

Use communicate() instead of write()/read() to prevent OS pipe buffer deadlocks (Python official recommendation)
Check subprocess exit code (returncode) for reliable error detection
Sanitize stderr output using repr() and truncate to 200 chars to prevent log injection attacks
Return generic error messages to clients while logging detailed errors server-side

2. Enhanced Error Handling (`tdd/td.py`)

Validate framing output before JSON parsing (check for empty/whitespace-only strings)
Catch JSONDecodeError and convert to application-specific exception
Log sanitized error details without exposing them to clients

3. JavaScript Error Handling (`js-src/frame-jsonld.js`)

Add try-catch block around framing logic
Output errors to stderr for proper logging
Exit with non-zero status code (1) on failure

4. New Exception Type (`tdd/errors.py`)

Introduce ExternalDependencyError with HTTP 502 status code
Semantically correct for upstream dependency failures (context URLs, Node.js process)
Supports multilingual error messages (EN, FR, DE)

Security Improvements

Log Injection Prevention

Before (vulnerable):

logging.error(f"frame-jsonld.js failed: {error_message}")  # Newlines can inject fake log entries

After (secure):

sanitized_stderr = repr(stderr_data.strip()[:200])
logging.error(f"frame-jsonld.js failed with exit code {p.returncode}: {sanitized_stderr}")

Information Leakage Prevention

Before (exposes internals):

{
  "detail": "Framing process failed: Error: Cannot resolve context https://attacker.com/probe",
  "status": 502
}

After (generic):

{
  "title": "External Dependency Error",
  "detail": "An upstream dependency failed to process the data. JSON-LD framing process failed.",
  "status": 502
}

Changes

Modified Files

tdd/errors.py - Add ExternalDependencyError exception class
tdd/common.py - Improve frame_nt_content() with secure subprocess handling
tdd/td.py - Improve frame_td_nt_content() with input validation and error handling
js-src/frame-jsonld.js - Add error handling and non-zero exit codes

New Files

tdd/tests/test_frame_error_handling.py - Comprehensive test suite (6 test cases)

Testing

Test Coverage

✅ Subprocess failure with non-zero exit code
✅ Empty response from framing engine
✅ Malformed JSON output
✅ Valid response processing
✅ Subprocess success scenario
✅ Log injection attack prevention

Test Results

85 passed, 680 warnings in 10.65s

All existing tests pass, confirming backward compatibility.

Related Issues

Fixes #39

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

wiresio · 2026-04-10T15:21:00Z

Thanks @kaiprodevops for tackling this issue. Here is Claude's feedback:

PR Review: fix/uncaught-jsondecodeerror-502

✅ What's Correct

communicate() replaces write()/read() — correct. The original pattern can deadlock when the subprocess output fills the OS pipe buffer before p.wait() is called. Using communicate() is the documented Python idiom.

Exit code checking — correct. The Node.js try-catch now ensures non-zero exit on any failure, and common.py checks returncode != 0.

ExternalDependencyError integrates automatically — ExternalDependencyError inherits from AppException, so the existing @app.errorhandler(AppException) in init.py handles it without any changes needed there. The 502 status code is semantically appropriate.

Defense-in-depth layering — the empty-string check and JSONDecodeError catch in td.py are valid secondary guards for the case where common.py doesn't raise (e.g., returncode 0 but empty stdout).

JS fix is correct — the try-catch handles both synchronous errors (bad JSON input) and async rejections (framing fails). process.exit(1) ensures a non-zero exit code is propagated.

❌ Bugs / Deficiencies

1. Tests 1, 5, 6 fail without built webpack artifacts (real bug)

Tests that mock subprocess.Popen fail because resources.path("tdd.lib", "frame-jsonld.js") is called before Popen and throws FileNotFoundError when tdd/lib/frame-jsonld.js doesn't exist (it must be built with npm run build). Confirmed locally:
FAILED test_frame_nt_content_subprocess_failure
FAILED test_frame_nt_content_subprocess_success
FAILED test_frame_nt_content_log_injection_prevention

FileNotFoundError: [Errno 2] No such file or directory: '.../tdd/lib/frame-jsonld.js'
The tests need to also mock resources.path in tdd.common, e.g.:
@patch("tdd.common.resources.path")
@patch("tdd.common.subprocess.Popen")
def test_frame_nt_content_subprocess_failure(mock_popen, mock_resources_path):
    mock_resources_path.return_value.__enter__ = lambda s: "fake/path"
    mock_resources_path.return_value.__exit__ = Mock(return_value=False)
    ...
Note: The existing test_GET_thing_OK has the same environment dependency, so this is partially pre-existing — but the PR introduces three new tests with the same flaw.

2. td_id not sanitized in td.py log message (inconsistency)
# tdd/td.py
logging.error(f"JSON decode failed for TD {td_id}: {repr(str(exc)[:100])}")
td_id comes from the URL path and could contain newlines, injecting fake log entries — the same attack the PR claims to prevent in common.py. Should be repr(str(td_id)[:200]).

⚠️ Minor Issues

3. Test 6 positive assertion is trivially true
assert "\\n" in logged_message or "'" in logged_message
repr() always wraps strings in single quotes, so "'" in logged_message is always True. The or makes the whole assertion trivially pass even if \n wasn't escaped. The meaningful assertion is the one that follows:
assert "\nFAKE LOG ENTRY" not in logged_message  # this is fine
The positive assertion should be strengthened to just assert "\\n" in logged_message.

4. details appears in English inside FR/DE messages
message_fr=f"Une dépendance en amont n'a pas pu traiter les données. {details}",
details (e.g., "JSON-LD framing process failed.") is always English, appearing verbatim in French and German responses. Minor UX inconsistency.

5. finally { rl.close() } misleads about error-path behavior

In Node.js, process.exit(1) immediately terminates the process and does not execute the finally block. So rl.close() is only reachable in the success path — the same behavior as the original placement. This is not a bug (exiting the process anyway), but is potentially confusing.

🔍 Missing Test Coverage

No integration test verifies the full HTTP lifecycle: that when framing fails, the endpoint returns:

Status 502

Content-Type: application/problem+json

{"title": "External Dependency Error", "status": 502, ...}

The existing test infrastructure in conftest.py (with test_client and httpx_mock) supports writing such a test.

Summary

Area Verdict

Core deadlock fix (communicate()) ✅ Correct

Exit code checking ✅ Correct

Flask error handler integration ✅ No changes needed

JS try-catch + process.exit(1) ✅ Correct

Log injection prevention in common.py ✅ Correct

Tests 1/5/6 without built assets ❌ Broken

td_id sanitization in log ❌ Missing

Test 6 assertion strength ⚠️ Trivially true

FR/DE details translation ⚠️ English only

HTTP 502 integration test ⚠️ Missing

wiresio · 2026-04-10T15:23:28Z

How to fix the tests:

npm run build would produce exactly tdd/lib/frame-jsonld.js (confirmed from webpack config: output goes to tdd/lib/[name].js). The failing tests would then pass, because resources.path("tdd.lib", "frame-jsonld.js") would resolve successfully and the Popen mock would be reached.

Two changes would be needed:

1. Run build before tests — tox.ini

Add a commands_pre directive to the [testenv] that runs the JS build before pytest:
 [testenv]
 deps =
     pytest
 extras = dev
+commands_pre =
+  npm install --prefer-offline
+  npm run build
 commands =
   {envpython} -m pytest {posargs:tdd/tests}
--prefer-offline avoids re-downloading packages if node_modules is already present. If CI always starts clean, use plain npm install.

2. (Optional but cleaner) Add a pytest session-scoped fixture that guards the build — conftest.py

Alternatively, add a session fixture that checks for the built file and gives a clear error if the build is missing, rather than a cryptic FileNotFoundError at test collection time:
# in tdd/tests/conftest.py
import subprocess
from pathlib import Path

@pytest.fixture(scope="session", autouse=True)
def ensure_js_build():
    lib_path = Path(__file__).parent.parent / "lib" / "frame-jsonld.js"
    if not lib_path.exists():
        pytest.exit(
            "tdd/lib/frame-jsonld.js not found. Run 'npm install && npm run build' first.",
            returncode=1,
        )
Which approach to pick

Approach Pros Cons

tox.ini commands_pre Fully automated, CI-safe Requires Node/npm in the test environment

conftest.py guard Clear error message for dev Doesn't fix the issue, just reports it better

Both together is the complete solution: tox.ini ensures the build runs automatically, and the conftest.py guard gives a human-readable error if someone runs pytest directly without the tox wrapper.

The tox.ini change is the only strictly necessary change to make Tests 1/5/6 pass in CI without switching to mocking resources.path.

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

kaiprodevops · 2026-04-14T03:46:01Z

Thanks @wiresio for the thorough review and detailed feedback! I've addressed all the issues you identified. Please review the changes and let me know if any adjustments are needed.

wiresio · 2026-04-15T12:23:18Z

Thanks @kaiprodevops!

kaiprodevops · 2026-04-15T15:45:09Z

You're very welcome! @wiresio Thanks again for the excellent guidance during the code review.

fix: handle empty JSON-LD framing responses and prevent log injection

ea51d3a

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

kaiprodevops added 5 commits April 13, 2026 22:45

chore(deps): apply npm audit fix to resolve security vulnerabilities

71eb1fe

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

fix: handle json decode error properly and return HTTP 502

b65efb0

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

ci: setup Node.js in lint job and use npm ci for reproducible builds

f57fca7

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

ci: fix linting and formatting

593c892

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

Remove the unused httpx import

af662de

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>

wiresio merged commit 3ca1b87 into eclipse-thingweb:main Apr 15, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: handle empty JSON-LD framing responses and prevent log injection#40

fix: handle empty JSON-LD framing responses and prevent log injection#40
wiresio merged 6 commits intoeclipse-thingweb:mainfrom
kaiprodevops:fix/uncaught-jsondecodeerror-502

kaiprodevops commented Apr 9, 2026

Uh oh!

wiresio commented Apr 10, 2026

PR Review: `fix/uncaught-jsondecodeerror-502`

✅ What's Correct

❌ Bugs / Deficiencies

1. Tests 1, 5, 6 fail without built webpack artifacts (real bug)

2. `td_id` not sanitized in td.py log message (inconsistency)

⚠️ Minor Issues

3. Test 6 positive assertion is trivially true

4. `details` appears in English inside FR/DE messages

5. `finally { rl.close() }` misleads about error-path behavior

🔍 Missing Test Coverage

Summary

Uh oh!

wiresio commented Apr 10, 2026

1. Run build before tests — tox.ini

2. (Optional but cleaner) Add a `pytest` session-scoped fixture that guards the build — conftest.py

Which approach to pick

Uh oh!

kaiprodevops commented Apr 14, 2026

Uh oh!

Uh oh!

wiresio commented Apr 15, 2026

Uh oh!

kaiprodevops commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kaiprodevops commented Apr 9, 2026

Summary

Problem

Root Cause

Solution

1. Improved Subprocess Communication (tdd/common.py)

2. Enhanced Error Handling (tdd/td.py)

3. JavaScript Error Handling (js-src/frame-jsonld.js)

4. New Exception Type (tdd/errors.py)

Security Improvements

Log Injection Prevention

Information Leakage Prevention

Changes

Modified Files

New Files

Testing

Test Coverage

Test Results

Related Issues

Uh oh!

wiresio commented Apr 10, 2026

PR Review: fix/uncaught-jsondecodeerror-502

✅ What's Correct

❌ Bugs / Deficiencies

1. Tests 1, 5, 6 fail without built webpack artifacts (real bug)

2. td_id not sanitized in td.py log message (inconsistency)

⚠️ Minor Issues

3. Test 6 positive assertion is trivially true

4. details appears in English inside FR/DE messages

5. finally { rl.close() } misleads about error-path behavior

🔍 Missing Test Coverage

Summary

Uh oh!

wiresio commented Apr 10, 2026

1. Run build before tests — tox.ini

2. (Optional but cleaner) Add a pytest session-scoped fixture that guards the build — conftest.py

Which approach to pick

Uh oh!

kaiprodevops commented Apr 14, 2026

Uh oh!

Uh oh!

wiresio commented Apr 15, 2026

Uh oh!

kaiprodevops commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. Improved Subprocess Communication (`tdd/common.py`)

2. Enhanced Error Handling (`tdd/td.py`)

3. JavaScript Error Handling (`js-src/frame-jsonld.js`)

4. New Exception Type (`tdd/errors.py`)

PR Review: `fix/uncaught-jsondecodeerror-502`

2. `td_id` not sanitized in td.py log message (inconsistency)

4. `details` appears in English inside FR/DE messages

5. `finally { rl.close() }` misleads about error-path behavior

2. (Optional but cleaner) Add a `pytest` session-scoped fixture that guards the build — conftest.py